fix: lazy store init — MCP server starts in <1s (v2.38.1) #20
Merged
… <1s (v2.38.1)
build_runtime() did ~20s of synchronous ML init (chromadb VectorStore + EmbeddingIndex construction + Ollama probes) before the MCP handshake, exceeding Claude Code's default MCP startup timeout and marking c3 "× Failed to connect".
Both stores now initialize their chromadb/Ollama backends lazily on first use via a lock-guarded, idempotent _ensure_ready(); work methods self-ensure, status views (ready/vector_enabled/get_stats) stay non-blocking. The MCP lifespan warms them in the background (and no longer gates the build on a synchronous .ready check, which would otherwise re-block the handshake).
build_runtime drops from ~20s to ~0.26s; no MCP_TIMEOUT override needed.
Bumps version to 2.38.1 and documents this plus the Windows cp1252 UTF-8 banner fix (#19) in the changelog. 427 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
This PR targets MCP server startup reliability by moving chromadb/Ollama initialization out of the MCP handshake path via lazy, lock-guarded backend initialization in the vector and embedding stores, plus background warm-up in the MCP server lifespan.
Changes:
- Make `VectorStore` and `EmbeddingIndex` defer heavy backend initialization until the first real work call (with idempotent locking) and add explicit `warm()` helpers.
- Adjust MCP server lifespan/background tasks to build/warm stores without synchronously gating startup.
- Add regression tests for “construct is cheap; first work initializes”, and bump version/docs for v2.38.1.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_lazy_store_init.py | Adds regression tests ensuring store construction doesn’t trigger heavy backend init. |
| services/vector_store.py | Defers chromadb/Ollama init + fallback load to `_ensure_ready()`; adds `warm()` and self-initializing work methods. |
| services/embedding_index.py | Defers chromadb/Ollama init + hash load to `_ensure_ready()`; adds `warm()` and self-initializing work methods. |
| cli/mcp_server.py | Moves embedding build + vector store warm-up into background threads so handshake isn’t blocked. |
| pyproject.toml | Bumps project version to 2.38.1. |
| cli/c3.py | Bumps CLI `__version__` to 2.38.1. |
| CHANGELOG.md | Documents v2.38.1 startup reliability fixes. |
Comment on lines +45 to +66
```python
        # Heavy backend init (chromadb import/client + ollama probe) and hash
        # load are deferred to first use so build_runtime stays fast and the MCP
        # handshake doesn't time out. See _ensure_ready().
        self._initialized = False
        self._init_lock = threading.Lock()

    # ── Backend init ──────────────────────────────────────

    def _ensure_ready(self):
        """Lazily init chromadb/ollama backends + file hashes on first use.

        Deferred from __init__ so build_runtime (and the MCP handshake) stays
        fast. Idempotent and thread-safe via double-checked locking.
        """
        if self._initialized:
            return
        with self._init_lock:
            if self._initialized:
                return
            self._init_backends()
            self._load_hashes()
            self._initialized = True
```
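The double-checked pattern in the excerpt above can be exercised in isolation. The following is a minimal sketch, not the project's code: `LazyStore`, its `init_calls` counter, the `time.sleep` placeholder, and `run_concurrent_searches` are all hypothetical names introduced for illustration. It shows that several threads racing into a work method still trigger backend init only once, while the status property never triggers it.

```python
import threading
import time


class LazyStore:
    """Hypothetical stand-in for a store with deferred backend init."""

    def __init__(self):
        # Construction only records state; no heavy work happens here.
        self._initialized = False
        self._init_lock = threading.Lock()
        self.init_calls = 0  # illustrative counter, not part of the PR

    def _init_backends(self):
        # Placeholder for the slow chromadb/Ollama setup (assumption).
        time.sleep(0.05)
        self.init_calls += 1

    def _ensure_ready(self):
        if self._initialized:  # fast path: no lock once initialized
            return
        with self._init_lock:
            if self._initialized:  # re-check: another thread may have won
                return
            self._init_backends()
            self._initialized = True

    @property
    def ready(self):
        # Status view: reports state without triggering init.
        return self._initialized

    def search(self, query):
        # Work method: ensures the backend exists before using it.
        self._ensure_ready()
        return [query]


def run_concurrent_searches(store, n_threads=8):
    """Call search() from several threads; return how often init ran."""
    threads = [
        threading.Thread(target=store.search, args=("q",))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.init_calls
```

Setting `_initialized` only after the init work completes is what makes the unlocked fast path safe: a thread that sees `True` can assume the backend is fully built.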
Summary
Release v2.38.1 — startup reliability fixes.
The C3 MCP server was intermittently marked "× Failed to connect" because `build_runtime()` did ~20s of synchronous ML init before the MCP handshake: `VectorStore` and `EmbeddingIndex` each eagerly constructed a chromadb `PersistentClient` (+ first `import chromadb`) and probed Ollama. That exceeded Claude Code's default MCP startup timeout.
Fix
- `VectorStore`/`EmbeddingIndex` defer chromadb/Ollama init into an idempotent `_ensure_ready()` (double-checked lock). Work methods (`add`/`search`/`delete`, `build`/`search`) self-ensure on first use; status reporters (`ready`/`vector_enabled`/`get_stats`) stay non-blocking. Construction is now cheap, so consumers (`MemoryStore`, `MetricsCollector`, `SessionPreloader`) are unaffected.
- `lifespan` no longer gates on a synchronous `.ready` check (which would re-block the handshake); it warms both stores in the background instead.
- `build_runtime` startup: ~20s → ~0.26s. No `MCP_TIMEOUT` override needed.

Changelog also documents the Windows cp1252 UTF-8 banner crash fix (#19).
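As a rough illustration of the background warm-up described in the fix, the sketch below uses only the standard library. It is not the actual `cli/mcp_server.py` code and does not use any MCP SDK API: `SlowStore`, the `lifespan(stores)` signature, and `measure_startup` are assumptions made for the example. The point is that the context is entered right away while `warm()` runs on daemon threads.

```python
import asyncio
import threading
import time
from contextlib import asynccontextmanager


class SlowStore:
    """Hypothetical store whose warm() stands in for slow backend init."""

    def __init__(self):
        self.warmed = threading.Event()

    def warm(self):
        time.sleep(0.5)  # placeholder for chromadb/Ollama setup (assumption)
        self.warmed.set()


@asynccontextmanager
async def lifespan(stores):
    # Start warm-up on daemon threads and yield immediately, so whatever
    # runs inside the context (here, the handshake) is not blocked.
    threads = [threading.Thread(target=s.warm, daemon=True) for s in stores]
    for t in threads:
        t.start()
    yield stores


async def measure_startup():
    """Return (seconds until the context was entered, whether warm-up finished)."""
    stores = [SlowStore(), SlowStore()]
    start = time.perf_counter()
    async with lifespan(stores):
        entered_after = time.perf_counter() - start
        # Wait for warm-up off the event loop; a real server would simply
        # keep serving and let work methods self-ensure if called early.
        await asyncio.to_thread(
            lambda: all(s.warmed.wait(timeout=5) for s in stores)
        )
    return entered_after, all(s.warmed.is_set() for s in stores)
```

Because work methods self-ensure, a request that arrives before warm-up finishes would pay the init cost itself rather than fail; the background threads only make that less likely.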
Testing
- `tests/test_lazy_store_init.py` (laziness is structural: `_initialized` flips on first work call, not on construct/status).
- `build_runtime('.')` measured at 0.26s (was ~20s).
- `cli.mcp_server` imports cleanly.

🤖 Generated with Claude Code
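A structural laziness test of the kind listed above might look like the sketch below. It does not reproduce `tests/test_lazy_store_init.py`; `FakeLazyIndex` and its injected `backend_factory` are hypothetical, used here so the test can observe init without chromadb or Ollama installed.

```python
import threading


class FakeLazyIndex:
    """Hypothetical minimal store; the real classes are VectorStore/EmbeddingIndex."""

    def __init__(self, backend_factory):
        self._backend_factory = backend_factory  # injected for observability
        self._backend = None
        self._initialized = False
        self._init_lock = threading.Lock()

    def _ensure_ready(self):
        if self._initialized:
            return
        with self._init_lock:
            if not self._initialized:
                self._backend = self._backend_factory()
                self._initialized = True

    def get_stats(self):
        # Status reporter: must not trigger init.
        return {"ready": self._initialized}

    def build(self):
        # Work method: initializes on first use.
        self._ensure_ready()
        return self._backend


def make_index():
    calls = []
    # list.append returns None, so the factory records the call and
    # then evaluates to the string "backend".
    index = FakeLazyIndex(lambda: calls.append("init") or "backend")
    return index, calls


def test_construct_and_status_do_not_initialize():
    index, calls = make_index()
    assert index._initialized is False
    assert index.get_stats() == {"ready": False}
    assert calls == []


def test_first_work_call_initializes_exactly_once():
    index, calls = make_index()
    assert index.build() == "backend"
    index.build()
    assert index._initialized is True
    assert calls == ["init"]
```

Asserting on the `_initialized` flag and the call log, rather than on elapsed time, keeps the test deterministic on slow CI machines.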